Alignment Across Oriental and Indo-European Languages

Abstract

The linguistic characteristics of Oriental languages and Indo-European languages differ greatly, so a purely length-based algorithm cannot achieve high performance when aligning texts across them. This paper investigates the effectiveness of a critical part-of-speech (POS) criterion for alignment under different search strategies and on texts of different registers. Two metrics, recall and precision, are defined for performance evaluation. The experimental results show that the critical-POS criterion behaves uniformly across search strategies and registers and is superior to the length-based criterion. Subject Areas: Bilingual Texts, Alignment, Text Processing.

Similar resources

Female Persian Scholars' Resistance to "The Orientalist Construction of Persia": A Critique of James Morier's The Adventure of Hajji Baba of Ispahan

The Western fascination with Persia, intensified though it was by the nineteenth century obsession with Indo-European languages, extends back into the classical origins of European culture. As Edward Said notes, Aeschylus' The Persians demonstrates how deeply embedded in the European mind is the belief in Persia's quintessential status as an Oriental society. For many Orientalists Persia demons...

Dictionary Organization in Linguistic Automaton for Oriental Languages

The central problem for natural language processing (NLP) systems dealing with non-Indo-European (“Oriental”) languages is how to develop automatic dictionaries (AD) and dictionary entry (DE) schemes. The need for industrial NLP for Oriental languages has been felt for some time. It has acquired additional urgency with the rapid growth of business contacts between Russia and the nat...

A Medical Language Processor for Two Indo-European Languages

The syntax and semantics of clinical narrative across Indo-European languages are quite similar, making it possible to envision a single medical language processor that can be adapted for different European languages. The Linguistic String Project of New York University is continuing the development of its Medical Language Processor in this direction. The paper describes how the processor operat...

Bilingual Knowledge Acquisition from Korean-English Parallel Corpus Using Alignment

This paper suggests a method to align a Korean-English parallel corpus. The structural dissimilarity between Korean and Indo-European languages requires more flexible measures to evaluate the alignment candidates between the bilingual units than are used to handle pairs of Indo-European languages. The flexible measure is intended to capture the dependency between bilingual items that can occ...

Building Parallel Corpora by Automatic Title Alignment

Cross-lingual semantic interoperability has drawn significant attention in recent digital library research as digital libraries in languages other than English have grown exponentially. Cross-lingual information retrieval (CLIR) across different European languages, such as English, Spanish, and French, has been widely explored; however, CLIR across European languages and Oriental languages is sti...

Publication date: 1995